Multiple Regression Methods Show Great Potential for Rare Variant Association Tests

نویسندگان

  • ChangJiang Xu
  • Martin Ladouceur
  • Zari Dastani
  • J. Brent Richards
  • Antonio Ciampi
  • Celia M. T. Greenwood
چکیده

The investigation of associations between rare genetic variants and diseases or phenotypes has two goals. Firstly, the identification of which genes or genomic regions are associated, and secondly, discrimination of associated variants from background noise within each region. Over the last few years, many new methods have been developed which associate genomic regions with phenotypes. However, classical methods for high-dimensional data have received little attention. Here we investigate whether several classical statistical methods for high-dimensional data: ridge regression (RR), principal components regression (PCR), partial least squares regression (PLS), a sparse version of PLS (SPLS), and the LASSO are able to detect associations with rare genetic variants. These approaches have been extensively used in statistics to identify the true associations in data sets containing many predictor variables. Using genetic variants identified in three genes that were Sanger sequenced in 1998 individuals, we simulated continuous phenotypes under several different models, and we show that these feature selection and feature extraction methods can substantially outperform several popular methods for rare variant analysis. Furthermore, these approaches can identify which variants are contributing most to the model fit, and therefore both goals of rare variant analysis can be achieved simultaneously with the use of regression regularization methods. These methods are briefly illustrated with an analysis of adiponectin levels and variants in the ADIPOQ gene.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

General framework for meta-analysis of rare variants in sequencing association studies.

We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare varia...

متن کامل

Assessing the effects of multiple markers in genetic association studies

It is believed that human diseases and their underlying causal pathways involve a complex interplay between multiple genetic and environmental risk factors. The search for relevant genetic variants, which typically have very small individual effects, is challenging. While typically genetic association studies involve testing for association between the phenotype and each individual genetic vari...

متن کامل

Variant association tools for quality control and analysis of large-scale sequence and genotyping array data.

Currently there is great interest in detecting associations between complex traits and rare variants. In this report, we describe Variant Association Tools (VAT) and the VAT pipeline, which implements best practices for rare-variant association studies. Highlights of VAT include variant-site and call-level quality control (QC), summary statistics, phenotype- and genotype-based sample selection,...

متن کامل

Testing genetic association with rare variants in admixed populations.

Recent studies suggest that rare variants play an important role in the etiology of many traits. Although a number of methods have been developed for genetic association analysis of rare variants, they all assume a relatively homogeneous population under study. Such an assumption may not be valid for samples collected from admixed populations such as African Americans and Hispanic Americans as ...

متن کامل

A Nonparametric Regression Approach to Control for Population Stratification in Rare Variant Association Studies

Recently, there is increasing interest to detect associations between rare variants and complex traits. Rare variant association studies usually need large sample sizes due to the rarity of the variants, and large sample sizes typically require combining information from different geographic locations within and across countries. Although several statistical methods have been developed to contr...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 7  شماره 

صفحات  -

تاریخ انتشار 2012